We study the price dynamics of stocks traded in a financial market by considering the statistical 
properties both of a single time series and of an ensemble of stocks traded simultaneously. We use 
the n stocks traded in the New York Stock Exchange to form a statistical ensemble of daily stock 
returns. For each trading day of our database, we study the ensemble return distribution. We find 
that a typical ensemble return distribution exists in most of the trading days with the exception of 
crash and rally days and of the days subsequent to these extreme events. We analyze each ensemble 
return distribution by extracting its first two central moments. We observe that these moments are 
fluctuating in time and are stochastic processes themselves. We characterize the statistical properties 
of ensemble return distribution central moments by investigating their probability density functions 
and temporal correlation properties. In general, time-averaged and portfolio-averaged price returns 
have different statistical properties. We infer from these differences information about the relative 
strength of correlation between stocks and between different trading days. Lastly, we compare our 
empirical results with those predicted by the single-index model and we conclude that this simple 
model is unable to explain the statistical properties of the second moment of the ensemble return 
distribution. 

PACS: 05.40.-a, 89.90.+n

I. INTRODUCTION

In recent years physicists have started to interact with economists in order to contribute to the modeling of financial markets as model complex systems. This has triggered the interest of a group of physicists in the analysis and modeling of price dynamics in financial markets by using paradigms and tools of statistical and theoretical physics. One goal of this research is to implement a stochastic model of price dynamics in financial markets which reproduces the statistical properties observed in the time evolution of stock prices. In the last few years physicists interested in financial analysis have performed several empirical studies investigating the statistical properties of stock price and volatility time series of a single stock (or of an index) at different temporal horizons. This kind of analysis does not take into account any interaction of the considered stock with other stocks traded simultaneously in the same market. It is known that the synchronous price return time series of different stocks are pair correlated, and several studies have also been performed by physicists in order to extract information from the correlation properties. A precise characterization of collective movements in a financial market is of key importance in understanding the market dynamics and in controlling the risk associated with a portfolio of stocks. The present study contributes to the understanding of the collective behavior of a portfolio of stocks in normal and extreme days of market activity.

Specifically, we address the question: Is the complexity of a financial market essentially limited to the statistical behavior of each financial time series, or does a complexity of the overall market exist? To answer this question, we present the results of an empirical analysis performed adopting the following point of view. We investigate the price returns of an ensemble of n stocks simultaneously traded in a financial market at a given day. With this approach we quantify what we call the variety of a financial market at a given trading day. The variety provides statistical information about the amount of different behavior observed in stock returns in a given ensemble of stocks at a given trading time horizon (in the present case, one trading day). We observe that the distribution of variety is sensitive to the composition of the portfolio investigated (especially to the capitalization of the considered stocks).

The return distribution shows a typical shape for most of the trading days. However, the typical behavior is not observed during crash and rally days. The shape and parameters characterizing the ensemble return distribution are relatively stable during normal phases of market activity, while they become time dependent in the periods subsequent to crashes. The variety is characterized by a long-range correlated memory, showing that no typical time scale can be expected for the relaxation to a "normal" market phase after a rally or a crash. Moreover, a simple model such as the single-index model is not able to reproduce the statistical properties empirically observed.

The paper is organized as follows. In Section II we illustrate our database and the ensemble of stocks considered. Section III is devoted to the investigation of the statistical properties of the time evolution of each single stock. In Section IV we discuss the statistical properties of the ensemble return distribution. Specifically, we consider the behavior of the lowest central moments, their distribution and correlation, a comparison of time and portfolio averages, and the role of the size and homogeneity of the investigated portfolio. In Section V we compare the statistical properties observed in a real financial market with the predictions of the single-index model. In Section VI we present a discussion of the obtained results.



II. DATABASE AND INVESTIGATED VARIABLES

The investigated market is the New York Stock Ex- 
change (NYSE) during the 12-year period from January 
1987 to December 1998 which corresponds to 3032 trad- 
ing days. We consider the ensemble of all stocks traded 
in the NYSE. The number of stocks traded in the NYSE 
is increasing in the investigated period and it ranges from 
1128 at the beginning of 1987 to 2788 at the end of 1998. 
The total number of data records exceeds 6 million. 

The variable investigated in our analysis is the daily 
price return, which is defined as 



$$ R_i(t) \equiv \frac{Y_i(t+1) - Y_i(t)}{Y_i(t)}, \tag{1} $$



where $Y_i(t)$ is the closing price of the i-th stock at day t (t = 1, 2, ...). For each trading day t, we consider n returns, where n depends on the total number of stocks traded in the NYSE at the selected day t. In our study we use a "market time". With this choice, we consider only the trading days and we remove the weekends and the holidays from the calendar time.

A database of more than 6 million records unavoidably contains some errors. A direct check of such a large database is not realistic. For this reason, to avoid spurious results we filter the data by not considering price returns which are greater than 50% in absolute value.
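
As a minimal sketch of this preprocessing step, the following Python snippet computes daily returns from a series of closing prices and discards those exceeding 50% in absolute value. It assumes that the return is the relative one-day change of the closing price; the function name `daily_returns`, the `max_abs` parameter and the toy prices are illustrative and not part of the original analysis.

```python
from typing import List, Optional


def daily_returns(prices: List[float], max_abs: float = 0.5) -> List[Optional[float]]:
    """Relative one-day price changes; suspicious values are replaced by None.

    Hypothetical helper: a return larger than max_abs in absolute value is
    treated as a probable data error and excluded from the analysis.
    """
    cleaned: List[Optional[float]] = []
    for today, next_day in zip(prices, prices[1:]):
        r = (next_day - today) / today
        cleaned.append(r if abs(r) <= max_abs else None)
    return cleaned


# Toy closing prices (not market data); the jump from 99.96 to 199.92 is a +100% move.
closing = [100.0, 102.0, 99.96, 199.92, 199.92]
returns = daily_returns(closing)
```

In this toy series the third return is filtered out, while the other three are kept.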

The companies traded in the NYSE differ considerably from one another. Differences among the companies are observed both with respect to the sector of their economic activity and with respect to their size. One measure of the size of a company is its capitalization. The capitalization of a stock is the stock price times the number of outstanding shares. In this study, we discuss the role of capitalization in the price dynamics.



III. SINGLE STOCK PROPERTIES 

The distribution of returns of a single stock or index at different time horizons has been studied by several authors.



The stocks traded in a financial market have different 
capitalization. An important point is whether the dif- 
ferences in capitalization are reflected in the statistical 
properties of the price returns of the stocks. To answer 
this question we investigate the distribution of daily re- 
turns of 2188 stocks traded in the NYSE at an arbitrarily 
chosen day that we select as June 10th, 1996. 




FIG. 1. Surface plot of the logarithm of the probability density function of normalized daily returns $(R_i(t) - \mu_i)/\sigma_i$ of all the stocks traded in the NYSE. The stocks are sorted according to their capitalization at June 10th, 1996.

We compare the statistical properties of the daily price return distribution of each stock as a function of its capitalization. We sort the 2188 stocks in decreasing order of their capitalization at June 10th, 1996. Our ordering procedure assigns rank i = 1 to the most capitalized stock (the General Electric Co., GE), rank i = 2 to the second one (the Coca Cola Company), and so on. An analysis of the return probability density function (pdf) for the 2188 stocks shows that the distributions are different. This is due in general to (i) different scale and (ii) different shape of the return pdfs. In order to eliminate one source of difference, we analyze the pdf of the normalized returns $(R_i(t) - \mu_i)/\sigma_i$ (i = 1, 2, ..., 2188), where $\mu_i$ and $\sigma_i$ are the first two central moments of the time series $R_i(t)$ defined as



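$$ \mu_i \equiv \frac{1}{T_i} \sum_{t=1}^{T_i} R_i(t), \tag{2} $$

$$ \sigma_i \equiv \sqrt{\frac{1}{T_i} \sum_{t=1}^{T_i} \left( R_i(t) - \mu_i \right)^2}, \tag{3} $$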



where $T_i$ is the number of trading days of stock i during the investigated period. The quantity $\mu_i$ gives a measure of the overall performance of stock i in the period. The standard deviation $\sigma_i$ is called historical volatility in the financial literature and quantifies the risk associated with the i-th stock. This quantity is of primary importance in risk management and in option pricing.

The pdf of normalized daily returns of all the stocks ordered by capitalization is shown in Fig. 1. The central part of the distribution of the most capitalized stocks has a bell-shaped profile. Moving towards less capitalized stocks, the central part of the distribution becomes more peaked and the tails of the distribution become fatter. The pdf of the less capitalized stocks is therefore more leptokurtic than the pdf of the more capitalized ones.

FIG. 2. Each circle represents the h parameter defined in Eq. (4) of the daily return distribution of a stock as a function of its capitalization. The dashed line is the value $\sqrt{2/\pi} \approx 0.80$, which is the lower bound for $h_G$ expected for a Gaussian distribution of daily returns. Values of h smaller than $h_G$ indicate a leptokurtic distribution of returns. The parameter h slowly increases with increasing capitalization.

The typical estimation of the degree of leptokurtosis of a pdf is done by considering its kurtosis. The evaluation of the kurtosis of the pdf is in general difficult for a small set of data because the fourth moment and all the moments higher than the second are extremely sensitive to the highest absolute returns. This implies that the kurtosis calculated from a relatively small set of records is dominated by the highest absolute returns rather than by the shape of the pdf, and therefore it is not a good statistical estimator. To avoid this problem, we quantify the distance between the empirically calculated pdf of daily returns of the i-th stock and the Gaussian distribution by considering the quantity



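$$ h \equiv \frac{\langle |x| \rangle}{\sqrt{\langle x^2 \rangle - \langle x \rangle^2}}. \tag{4} $$

The quantity h is nondimensional and depends on the first two moments. For the Gaussian distribution

$$ P(x) = \frac{1}{\sqrt{2\pi}\,\sigma_G} \exp\left( -\frac{(x - \mu_G)^2}{2\sigma_G^2} \right) \tag{5} $$

the parameter h is equal to

$$ h_G = \sqrt{\frac{2}{\pi}} \exp\left( -\frac{\mu_G^2}{2\sigma_G^2} \right) + \frac{\mu_G}{\sigma_G}\, \mathrm{erf}\left( \frac{\mu_G}{\sqrt{2}\,\sigma_G} \right). \tag{6} $$

The parameter $h_G$ is a function of the ratio $\mu_G/\sigma_G$, ranging from the lower bound $\sqrt{2/\pi}$ when $\mu_G/\sigma_G = 0$ to infinity.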
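
The parameter h and its Gaussian reference value can be evaluated numerically as in the sketch below. The function names and the synthetic samples are assumptions made for illustration; `h_gaussian` evaluates the mean absolute value of a Gaussian variable divided by its standard deviation.

```python
import math
import random
from typing import Sequence


def h_parameter(x: Sequence[float]) -> float:
    """Mean absolute value divided by the standard deviation of the sample."""
    n = len(x)
    mean = sum(x) / n
    mean_abs = sum(abs(v) for v in x) / n
    variance = sum(v * v for v in x) / n - mean * mean
    return mean_abs / math.sqrt(variance)


def h_gaussian(mu: float, sigma: float) -> float:
    """Value of h for a Gaussian pdf with mean mu and standard deviation sigma."""
    ratio = mu / sigma
    return (math.sqrt(2.0 / math.pi) * math.exp(-0.5 * ratio * ratio)
            + ratio * math.erf(ratio / math.sqrt(2.0)))


rng = random.Random(0)  # fixed seed so that the toy samples are reproducible
gaussian_sample = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
# A Laplace variable can be drawn as an exponential variable with a random sign.
laplace_sample = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(50_000)]

h_gauss_empirical = h_parameter(gaussian_sample)    # close to sqrt(2/pi)
h_laplace_empirical = h_parameter(laplace_sample)   # close to 1/sqrt(2), below h_G
```

For a zero-mean Laplace distribution the exact value is $1/\sqrt{2} \approx 0.71$, which lies below the Gaussian value $\sqrt{2/\pi} \approx 0.80$, as expected for a leptokurtic pdf.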




FIG. 3. Surface plot of the logarithm of the ensemble return distribution for the 12-year investigated period from January 1987 to December 1998. The 1987 crash (trading day index equal to 200) and the high-volatility two-year period 1997-1998 (trading day index from 2500 to 3032) are clearly recognizable in the figure.

For a leptokurtic pdf, such as a Laplace distribution or a Student's t-distribution with finite variance, h is always smaller than $h_G$. The distance of h from $h_G$ quantifies the degree of leptokurtosis of the considered pdf. Figure 2 shows the parameter h for the stocks traded in the NYSE as a function of their capitalization. In the figure, we also show the lower bound of $h_G$ for comparison. The empirically calculated parameter h is systematically smaller than $h_G$. The mean value of h over the overall market is $\langle h \rangle = 0.67$ and its standard deviation is $\sigma_h = 0.06$. This result suggests that, as a first approximation, one can assume that the large majority of stocks are characterized by a roughly similar pdf. However, we wish to point out that this conclusion is only valid as a first approximation, because a trend of h is clearly detected in Fig. 2. Specifically, h increases as the capitalization increases. Therefore the less capitalized stocks have a more leptokurtic daily return pdf than the more capitalized ones.

The second moment of the return distribution has been found to be finite in recent research. In order to verify the convergence of the pdf towards a Gaussian pdf at large temporal horizons, we evaluate the mean h parameter for weekly ($\langle h_w \rangle$) and monthly ($\langle h_m \rangle$) return pdfs. We obtain from our analysis $\langle h_w \rangle = 0.70$ and $\langle h_m \rangle = 0.74$. These results show that the value of h moves towards $h_G = \sqrt{2/\pi} \approx 0.80$ when the time horizon of returns is increased, supporting the conclusion of a finite second moment.
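
The direction of this effect can be illustrated with synthetic data. The sketch below assumes independent, identically distributed Laplace daily returns (a simplification, not the empirical NYSE data) and sums them over non-overlapping blocks of 5 and 20 days as stand-ins for weekly and monthly returns; all names and parameter values are illustrative.

```python
import math
import random
from typing import List, Sequence


def h_parameter(x: Sequence[float]) -> float:
    """Mean absolute value divided by the standard deviation of the sample."""
    n = len(x)
    mean = sum(x) / n
    mean_abs = sum(abs(v) for v in x) / n
    variance = sum(v * v for v in x) / n - mean * mean
    return mean_abs / math.sqrt(variance)


def aggregate(returns: Sequence[float], block: int) -> List[float]:
    """Sum returns over consecutive non-overlapping blocks of the given length.

    Summing is used as a small-return approximation of compounding.
    """
    n_blocks = len(returns) // block
    return [sum(returns[k * block:(k + 1) * block]) for k in range(n_blocks)]


rng = random.Random(1)
# Synthetic Laplace daily returns with a typical size of 1%.
daily = [rng.choice((-1.0, 1.0)) * rng.expovariate(100.0) for _ in range(200_000)]

h_daily = h_parameter(daily)
h_weekly = h_parameter(aggregate(daily, 5))
h_monthly = h_parameter(aggregate(daily, 20))
```

With finite variance and independent increments, the aggregated returns approach a Gaussian shape, so h increases from about 0.71 towards $\sqrt{2/\pi}$ as the block length grows. The empirical values quoted above converge more slowly than in this idealized case.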



IV. ENSEMBLE RETURN DISTRIBUTION 

In the previous section we focused on the statistical properties of the time evolution of price returns for each single stock traded in the NYSE. In this section we perform a synchronous analysis of the returns of all the stocks traded in the NYSE. To this end we extract the n returns of the n stocks for each trading day t. The distribution of these returns, $P_t(R)$, provides information about the kind of activity occurring in the market at the selected trading day t.

FIG. 4. Contour plot of the logarithm of the ensemble return distribution for the 12-year investigated period from January 1987 to December 1998 (same data as in Fig. 3). The contour plot is obtained for equidistant intervals of the logarithmic probability density. The brightest area of the contour plot corresponds to the most probable value.

Figure 3 shows the logarithm of the pdf as a function of the return and of the trading day. In this figure we show the interval of daily returns from -25% to 25%. The central part of the distribution is roughly triangular in a logarithmic scale, and this shape and its scale are conserved for long time periods. Sometimes the shape and scale of the ensemble return pdf change abruptly, in the presence of either large positive or large negative average returns. Figure 4 shows the same data as Fig. 3 in a contour plot. The contour lines describe equiprobability regions. In order to point out the properties of the central part of the distribution, in Fig. 4 we plot only the returns which are less than 15% in absolute value. Only a few points of the contour lines fall beyond this limit during the 1987 and 1998 crises. In Fig. 4 there are long time periods in which the central part of the distribution maintains its shape and the equiprobability contour lines are approximately parallel to one another. As an example, one can consider the three-year period 1993-1995.

FIG. 5. Ratio between the h parameter defined in Eq. (4) of the ensemble return distribution and the value of $h_G$ expected for a Gaussian distribution and defined by Eq. (6), for each trading day. The ratio $h/h_G$ is systematically smaller than one, indicating that the ensemble return distribution is leptokurtic for each trading day.

On the other hand, there are time periods in which the shape of the distribution changes drastically. In general these periods correspond to financial turmoil in the market. For example, a dramatic change of the shape and of the scale of the pdf is observed in Fig. 4 during and after the 19 Oct. 1987 crash, at the beginning of 1991 and at the end of 1998. A systematic analysis of the change of the shape and scale of the ensemble return distribution during extreme events of the market has been discussed elsewhere.

One key aspect of the ensemble return distribution concerns its shape during the normal periods of activity of the market. Is the distribution approximately Gaussian, or are systematic deviations from a Gaussian shape quantitatively observed? We already noted that a direct inspection of Fig. 3 suggests that the central part of the empirical return distribution is roughly Laplacian (triangular in a logarithmic scale) and not Gaussian. To make this analysis more quantitative, we show in Fig. 5 the ratio between the value of h determined for each trading day from the ensemble return distribution and the quantity $h_G$, calculated from Eq. (6) by determining the mean and the standard deviation of $P_t(R)$ and hypothesizing a Gaussian shape. The ratio $h/h_G$ is systematically smaller than one, which implies that the Gaussian hypothesis for the shape of the distribution is not supported by the empirical analysis. In other words, the Gaussian distribution is not a good approximation either for the central part or for the tails of the distribution, and the deviation from Gaussian behavior is systematically observed for all the trading days of the 12-year time period analyzed in our study.

In summary, the ensemble return distribution characterizes the market activity well. It has a typical shape and scale during long periods of "normal" activity of the market, characterized by moderately low average daily returns. During extreme events the shape and scale change dramatically in a systematic way. Specifically, during crises the ensemble return distribution becomes negatively skewed, whereas during rallies a positive skewness is observed. Figure 4 clearly shows that extreme events (such as the October 1987 crash) trigger an "aftershock" period in the ensemble return pdf that can last for several months.

FIG. 6. Linear-log plot of the probability density function of the mean $\mu(t)$ of the ensemble return distribution (white diamond) and of the mean of the daily return $\mu_i$ of all the stocks traded in the NYSE (black square).



A. Central moments

In order to characterize the ensemble return distribution at day t more quantitatively, we extract the first two central moments at each of the 3032 trading days. Specifically, we consider the average and the standard deviation defined as

$$ \mu(t) = \frac{1}{n_t} \sum_{i=1}^{n_t} R_i(t), \tag{7} $$

$$ \sigma(t) = \sqrt{\frac{1}{n_t} \sum_{i=1}^{n_t} \left( R_i(t) - \mu(t) \right)^2}, \tag{8} $$

where $n_t$ indicates the number of stocks traded at day t.

The mean of price returns $\mu(t)$ quantifies the general trend of the market at day t. The standard deviation $\sigma(t)$ gives a measure of the width of the ensemble return distribution. We call this quantity the variety of the ensemble because it gives a measure of the variety of behavior observed in a financial market at a given day. A large value of $\sigma(t)$ indicates that different companies are characterized by rather different returns at day t. In fact, in days of high variety some companies achieve great gains whereas others suffer great losses. The mean and the standard deviation of price returns are not constant and fluctuate in time. We study the time series of $\mu(t)$ and $\sigma(t)$ in order to characterize the temporal evolution of the ensemble return distribution quantitatively. We investigate these fluctuating parameters by analyzing their time correlation properties and their pdfs.

FIG. 7. Log-log plot of the probability density function of the variety $\sigma(t)$, i.e. the standard deviation of the ensemble return distribution (white diamond), and of the volatility $\sigma_i$, i.e. the standard deviation of the daily return, of all the stocks traded in the NYSE (black square).

B. Probability distributions of the central moments



The empirical pdf of the mean $\mu(t)$ for the 3032 trading days investigated is shown in Fig. 6. The central part of this distribution is non-Gaussian and is roughly described by a Laplace distribution.

The mean $\mu(t)$ is proportional to the sum of n random variables $R_i(t)$ (i = 1, 2, ..., n). The Central Limit Theorem prescribes that the sum of n independent random variables with finite variance converges to a Gaussian pdf. Assuming a finite value for the volatility of stocks, the observation that the pdf of the mean return $\mu(t)$ is non-Gaussian can therefore be attributed to the presence of correlation between the stocks.

Figure 7 shows the pdf of the variety $\sigma(t)$. The central part of this distribution is approximated by a lognormal distribution. A deviation from the lognormal behavior is observed in the tail of higher values of variety. This deviation depends on the size of the portfolio and will be discussed in subsection IV E.
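
The ensemble moments of Eqs. (7) and (8), whose distributions are discussed above, can be computed day by day as in the following sketch. The panel layout (one row per trading day, `None` for a stock that is not traded on that day) and the toy numbers are assumptions chosen for illustration.

```python
import math
from typing import List, Optional, Tuple


def ensemble_moments(day_returns: List[Optional[float]]) -> Tuple[float, float]:
    """Return (mean, variety) of the returns available on one trading day."""
    traded = [r for r in day_returns if r is not None]
    n_t = len(traded)  # number of stocks traded on this day
    mu_t = sum(traded) / n_t
    sigma_t = math.sqrt(sum((r - mu_t) ** 2 for r in traded) / n_t)
    return mu_t, sigma_t


# Toy panel (not market data): rows are trading days, columns are stocks.
panel = [
    [0.01, 0.00, -0.01, 0.02],     # a quiet day
    [-0.20, -0.05, -0.30, 0.03],   # a crash-like day with widely spread returns
    [0.02, None, 0.04, 0.00],      # a day on which one stock is not traded
]
moments = [ensemble_moments(day) for day in panel]
```

In this toy panel the second day has both a strongly negative mean and a variety about ten times larger than that of the first day.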
C. Correlations in the central moments

Another important statistical property of $\mu(t)$ and $\sigma(t)$ concerns their correlation properties. For the considered portfolio, we calculate the autocorrelation function of a variable x(t), which is defined as

$$ R(\tau) \equiv \frac{\langle x(t) x(t+\tau) \rangle - \langle x(t) \rangle \langle x(t+\tau) \rangle}{\langle x(t)^2 \rangle - \langle x(t) \rangle^2}. \tag{9} $$

In agreement with previous results, we find that the mean $\mu(t)$ is approximately delta correlated, whereas the variety $\sigma(t)$ is long-range correlated. The empirical autocorrelation function of $\sigma(t)$ is well approximated by a power-law function $R(\tau) \propto \tau^{-\delta}$. By performing a best fit with a maximum time lag of 50 trading days, we determine the exponent $\delta = 0.230 \pm 0.006$. This result indicates that the variety $\sigma(t)$ has a long-time memory in the market. We recall that the historical volatility is characterized by a long-time memory of the same nature.

Another way to investigate the long-range correlation is to determine the power spectrum of the investigated variable. We evaluate the power spectrum of $\sigma(t)$ and we perform a best fit of the power spectrum with a functional form of the kind

$$ S(f) \propto \frac{1}{f^{\eta}}. \tag{10} $$

Our best fit for the power spectrum of $\sigma(t)$ gives for the exponent $\eta \approx 1.1$. This result confirms that the variety $\sigma(t)$ is a long-range time-correlated random variable.

D. Time and portfolio average

Figure 6 shows two curves: together with the pdf of $\mu(t)$, we also show the pdf of the mean $\mu_i$. The quantity $\mu_i$ (see Eq. (2)) is the mean return of stock i averaged over the investigated time interval. The pdf of $\mu_i$ is non-Gaussian and it is much more peaked than the pdf of $\mu(t)$. Hence the statistical behavior observed by investigating a large portfolio in a market day is not representative of the statistical behavior observed by investigating the time evolution of single stocks.

This comparison can also be performed for the second moment of the distributions. In Fig. 7 we compare the pdf of the volatility $\sigma_i$ and the pdf of the variety $\sigma(t)$. Also in this case, the statistical properties of $\sigma_i$ and $\sigma(t)$ are different. Specifically, the pdf of $\sigma(t)$ is more peaked than the pdf of $\sigma_i$.

In order to understand the different behavior of the time-averaged and the portfolio-averaged quantities, for the sake of simplicity we consider a portfolio composed of N stocks which are traded in a period of T trading days. We first study the properties of the two means, $\mu_i$ and $\mu(t)$. It is straightforward to verify that

$$ \langle \mu_i \rangle_i = \langle \mu(t) \rangle_t = \mu, \tag{11} $$

where $\langle .. \rangle_t$ indicates temporal average and $\langle .. \rangle_i$ indicates ensemble average. The variances of $\mu_i$ and $\mu(t)$ are in general different. We obtain for the variance of $\mu(t)$ the expression

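$$ \mathrm{Var}[\mu(t)]_t \equiv \frac{1}{T} \sum_{t=1}^{T} \left( \mu(t) - \mu \right)^2 = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \sigma^2_{ij}, \tag{12} $$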

where $\sigma^2_{ij}$ is the return covariance between stocks i and j, defined as

$$ \sigma^2_{ij} \equiv \langle R_i(t) R_j(t) \rangle_t - \langle R_i(t) \rangle_t \langle R_j(t) \rangle_t. \tag{13} $$

The width of the pdf of $\mu(t)$ (shown in Fig. 6) is the square root of $\mathrm{Var}[\mu(t)]_t$. Equations (12) and (13) indicate that this quantity depends both on the ensemble-averaged squared volatility (terms with $i = j$ in Eq. (12)) and on the mean of the synchronous cross-covariances between pairs of stocks (terms with $i \neq j$ in Eq. (12)).

With similar methods we show that the variance of $\mu_i$ can be written as

$$ \mathrm{Var}[\mu_i]_i \equiv \frac{1}{N} \sum_{i=1}^{N} \left( \mu_i - \mu \right)^2 = \frac{1}{T^2} \sum_{t=1}^{T} \sum_{t'=1}^{T} \sigma^2_{tt'}, \tag{14} $$

where we define the return covariance between trading days t and t' as

$$ \sigma^2_{tt'} \equiv \langle R_i(t) R_i(t') \rangle_i - \langle R_i(t) \rangle_i \langle R_i(t') \rangle_i. \tag{15} $$

This quantity gives an estimate of the correlation present in the whole portfolio between trading days t and t'. The double sum in Eq. (14) can be split into a term depending on the average squared variety ($t = t'$) and a term depending on the correlation between different trading days ($t \neq t'$).

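We verify that the average squared variety and volatility satisfy the sum rule

$$ \mathrm{Var}[\mu_i]_i + \langle \sigma_i^2 \rangle_i = \mathrm{Var}[\mu(t)]_t + \langle \sigma^2(t) \rangle_t. \tag{16} $$

Combining Eqs. (12), (14) and (16) we show that

$$ \frac{T-1}{T} \langle \sigma^2(t) \rangle_t + \frac{2}{N^2} \sum_{j=1}^{N} \sum_{i<j} \sigma^2_{ij} = \frac{N-1}{N} \langle \sigma_i^2 \rangle_i + \frac{2}{T^2} \sum_{t=1}^{T} \sum_{t'<t} \sigma^2_{tt'}. \tag{17} $$

Since $N, T \gg 1$, we approximate $(N-1)/N \approx (T-1)/T \approx 1$ and Eq. (17) becomes

$$ \langle \sigma_i^2 \rangle_i - \langle \sigma^2(t) \rangle_t = \langle \sigma^2_{ij} \rangle_{i \neq j} - \langle \sigma^2_{tt'} \rangle_{t \neq t'}, \tag{18} $$

or equivalently

$$ \mathrm{Var}[\mu(t)]_t - \mathrm{Var}[\mu_i]_i = \langle \sigma^2_{ij} \rangle_{i \neq j} - \langle \sigma^2_{tt'} \rangle_{t \neq t'}. \tag{19} $$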



FIG. 8. Log-log plot of the probability density function of the variety $\sigma(t)$ for the four considered ensembles of stocks: (a) DJIA30, (b) SP100, (c) SP500, (d) NYSE. The solid lines are our best fit of the central part of the distribution according to a lognormal distribution.

Figure 6 shows that $\mathrm{Var}[\mu(t)]_t > \mathrm{Var}[\mu_i]_i$. This empirical observation, together with Eq. (19), tells us that the synchronous cross-correlations between the stocks are on average stronger than the correlations present in the whole portfolio between two different trading days. This result is consistent with previous observations that synchronous returns of different stocks are significantly cross-correlated, whereas single price returns are poorly autocorrelated in time. This conclusion is also verified by our empirical observation that $\langle \sigma_i^2 \rangle_i > \langle \sigma^2(t) \rangle_t$.
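
The sum rule of Eq. (16) is an exact identity for a complete N x T panel of returns, which can be checked numerically. The sketch below uses synthetic returns built from a common factor plus idiosyncratic Gaussian noise; the data-generating choice, the parameter values and the helper names are assumptions for illustration only.

```python
import random
from typing import List


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def variance(values: List[float]) -> float:
    """Population variance (normalization 1/len, as in the moments defined in the text)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


rng = random.Random(0)
N, T = 8, 50  # toy portfolio: N stocks, T trading days
common = [rng.gauss(0.0, 0.01) for _ in range(T)]  # factor shared by all stocks
R = [[common[t] + rng.gauss(0.0, 0.02) for t in range(T)] for _ in range(N)]  # R[i][t]

mu_i = [mean(R[i]) for i in range(N)]                               # time averages
vol2_i = [variance(R[i]) for i in range(N)]                         # squared volatilities
mu_t = [mean([R[i][t] for i in range(N)]) for t in range(T)]        # ensemble averages
var2_t = [variance([R[i][t] for i in range(N)]) for t in range(T)]  # squared varieties

lhs = variance(mu_i) + mean(vol2_i)  # left-hand side of the sum rule
rhs = variance(mu_t) + mean(var2_t)  # right-hand side of the sum rule
```

Both sides equal the variance of all N x T returns pooled together, so `lhs` and `rhs` agree up to rounding. Because the synthetic stocks share a common factor but are uncorrelated in time, the variance of the ensemble means exceeds the variance of the time means, which is the same ordering as the one observed empirically.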



E. Portfolio size 

One key aspect of the previous results concerns the degree of generality of the observed stylized facts. In other words, do the empirical properties of the variety depend on the considered portfolio? In Section III we have shown that the stocks are not all equivalent with respect to their statistical properties (see the spread of points observed in Fig. 2). In fact, a trend is observed in the degree of non-Gaussian shape of the return distribution as a function of the stock capitalization.

To test the sensitivity of our results to the average capitalization of the selected portfolio, we repeat the analysis presented in subsection IV B for three other portfolios of stocks traded in the NYSE. Specifically we investigate: (a) the set of 30 stocks used to compute the Dow Jones Industrial Average index; (b) the set of stocks traded in the NYSE and used to compute the Standard & Poor's 100 index; and (c) the set of stocks traded in the NYSE and used to compute the Standard & Poor's 500 index. The results obtained for all the stocks traded in the NYSE are also considered for reference. The four sets differ in two respects: the number of stocks present in the set and the average capitalization of the considered stocks. The empirical pdfs of $\mu(t)$ for the four considered sets are roughly the same. A clearly different behavior is observed for the variety. In Fig. 8 we show the pdf of the variety of the considered portfolios of stocks. Specifically, panels (a), (b), (c) and (d) of Fig. 8 show the results obtained for the Dow Jones 30, Standard & Poor's 100, Standard & Poor's 500 and NYSE sets of stocks, respectively. Moving from the smallest to the largest portfolio of stocks, two effects take place: the pdf of the variety becomes progressively sharper and it deviates more from a lognormal profile. The progressive sharpening is probably due to the increasing number of elements in the considered set, whereas we interpret the progressive deviation from the lognormal profile as a direct manifestation of the progressively increasing inhomogeneity of the portfolio of stocks.

In summary, the presence of inhomogeneity in capitalization in the portfolio of stocks affects the statistical properties of the variety of the portfolio. This fact should be kept in mind when results about the variety, or about other statistical properties including the return distribution, are obtained by considering a set of inhomogeneous stocks.



V. SINGLE-INDEX MODEL 

In this section we compare the results of our empirical analysis obtained for the NYSE portfolio of stocks with the results obtained by modeling the stock price dynamics with the single-index model. The single-index model is a basic model of price dynamics in financial markets. It assumes that the returns of all stocks are controlled by one factor, usually called the "market". In this model, for any stock i we have

$$ R_i(t) = \alpha_i + \beta_i R_M(t) + \epsilon_i(t), \tag{20} $$

where $R_i(t)$ and $R_M(t)$ are the return of stock i and of the "market" at day t, respectively, $\alpha_i$ and $\beta_i$ are two real parameters, and $\epsilon_i(t)$ is a zero-mean noise term characterized by a variance equal to $\sigma^2_{\epsilon_i}$. The noise terms of different stocks are assumed to be uncorrelated, $\langle \epsilon_i(t) \epsilon_j(t) \rangle_t = 0$ for $i \neq j$. Moreover, the covariance between $R_M(t)$ and $\epsilon_i(t)$ is set to zero for any i.

Each stock is correlated with the market, and the presence of such a correlation induces a correlation between any pair of stocks. It is customary to adopt a broad-based stock index for the market $R_M(t)$. Our choice for the "market" time series is the Standard and Poor's 500 index. The best estimation of the model parameters $\alpha_i$, $\beta_i$ and $\sigma^2_{\epsilon_i}$ is done with the ordinary least squares method.

FIG. 9. (a) Time series of the mean of the ensemble return distribution $\mu(t)$. (b) Time series of the mean of the ensemble return distribution for the surrogate data generated according to the single-index model. (c) Time series of the variety $\sigma(t)$ of the ensemble return distribution. (d) Time series of the variety of the ensemble return distribution for the surrogate data generated according to the single-index model.

In order to compare our empirical results with those predicted by the single-index model, we build up an artificial market according to Eq. (20). To this end we first evaluate the model parameters for all the stocks traded in the NYSE, and then we generate a set of n surrogate time series according to Eq. (20). To make the simulation as realistic as possible, in the generation of our surrogate data set we use as "market" time series the true time series of the Standard and Poor's 500 index.
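
A minimal sketch of this procedure for one stock is given below: ordinary least squares estimates of the parameters of Eq. (20), followed by the generation of a surrogate series driven by the same market returns. Drawing the noise term from a Gaussian distribution is an assumption of this sketch, and the function names and toy data are illustrative.

```python
import math
import random
from typing import List, Tuple


def fit_single_index(stock: List[float], market: List[float]) -> Tuple[float, float, float]:
    """OLS estimates (alpha, beta, residual variance) for one stock.

    The residual variance uses the 1/T normalization for simplicity.
    """
    T = len(stock)
    mean_s = sum(stock) / T
    mean_m = sum(market) / T
    cov_sm = sum((s - mean_s) * (m - mean_m) for s, m in zip(stock, market)) / T
    var_m = sum((m - mean_m) ** 2 for m in market) / T
    beta = cov_sm / var_m
    alpha = mean_s - beta * mean_m
    residual_var = sum((s - alpha - beta * m) ** 2 for s, m in zip(stock, market)) / T
    return alpha, beta, residual_var


def surrogate_series(alpha: float, beta: float, residual_var: float,
                     market: List[float], rng: random.Random) -> List[float]:
    """Surrogate returns: fitted linear response to the market plus Gaussian noise (assumed)."""
    noise_sd = math.sqrt(residual_var)
    return [alpha + beta * m + rng.gauss(0.0, noise_sd) for m in market]


rng = random.Random(0)
market = [rng.gauss(0.0005, 0.01) for _ in range(2000)]             # toy market returns
stock = [0.0002 + 1.3 * m + rng.gauss(0.0, 0.015) for m in market]  # toy stock returns

alpha, beta, residual_var = fit_single_index(stock, market)
surrogate = surrogate_series(alpha, beta, residual_var, market, rng)
```

Repeating the fit and the generation for every stock, with the same market series, gives a surrogate panel from which the ensemble mean and variety can be recomputed.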

We evaluate the central moments $\mu(t)$ and $\sigma(t)$ defined
in Eqs. (7-8) for the surrogate data. In Fig. 9(a) we
show the time series of $\mu(t)$ for the real data and in Fig.
9(b) we show the same quantity for the surrogate market
data generated according to the single-index model. The
agreement between the two time series is good and
therefore the single-index model describes quite well the
mean return of the market at time $t$, provided that the
behavior of the "market" $R_M(t)$ is known. This result is
also confirmed by Fig. 10, where the pdfs of $\mu(t)$ for real
and surrogate data are shown. The time correlation
properties of the surrogate $\mu(t)$ are also similar to the real
ones: a fast-decaying autocorrelation function of
$\mu(t)$ is observed in surrogate data. A good agreement is
also observed when one investigates the statistical properties
of $\mu_i$ and $\sigma_i$. The single-index model approximates
quite well the empirical distributions of $\mu_i$ and $\sigma_i$.
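
For concreteness, the two ensemble moments compared above can be computed from a matrix of daily returns as follows. This is a sketch only: the function name `ensemble_moments` is hypothetical, and the $1/n$ normalization of the variety is an assumption about Eq. (8).

```python
import numpy as np

def ensemble_moments(returns):
    """Mean mu(t) and variety sigma(t) of the ensemble return distribution.

    returns: array of shape (T, n); row t holds the returns of all n stocks
    on trading day t.
    """
    mu = returns.mean(axis=1)
    variety = returns.std(axis=1)   # ddof=0, i.e. 1/n normalization (assumed)
    return mu, variety
```

Applying the same function to the real and to the surrogate return matrices gives the pairs of time series shown in the panels of Fig. 9.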

A different behavior is observed for the variety $\sigma(t)$.
Figures 9(c) and 9(d) show the time series of $\sigma(t)$ for real
and surrogate data, respectively. The real time series of
the variety is nonstationary and shows several bursts
of activity. On the contrary, the surrogate time series is
nearly stationary with the exception of the 1987 crash.
Figure 11 shows the pdfs of $\sigma(t)$ for real and surrogate
data. The model fails to describe the distribution of
$\sigma(t)$.
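
One way to see why the surrogate variety is nearly flat outside the 1987 crash is to insert the model into the definition of the variety. The following is a sketch, assuming that Eq. (20) has the standard form $R_i(t) = \alpha_i + \beta_i R_M(t) + \epsilon_i(t)$ and that Eq. (8) uses a $1/n$ normalization; the overbars, which denote ensemble averages over $i$, are notation introduced here.

$$
R_i(t) - \mu(t) = (\alpha_i - \bar{\alpha}) + (\beta_i - \bar{\beta})\, R_M(t) + \left(\epsilon_i(t) - \bar{\epsilon}(t)\right),
$$

$$
\sigma^2(t) = \frac{1}{n} \sum_{i=1}^{n} \left[ (\alpha_i - \bar{\alpha}) + (\beta_i - \bar{\beta})\, R_M(t) + \left(\epsilon_i(t) - \bar{\epsilon}(t)\right) \right]^2 .
$$

The only time-dependent inputs are $R_M(t)$ and the noise, and the noise has time-independent statistics by construction. On days when $|R_M(t)|$ is small the model variety is therefore expected to fluctuate around a roughly constant level, and only a very large $|R_M(t)|$ can make the $(\beta_i - \bar{\beta}) R_M(t)$ term stand out, which would be consistent with the isolated peak at the 1987 crash.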
FIG. 10. Comparison of the probability density function
of the mean $\mu(t)$ of the ensemble return distribution
obtained from real data (diamonds) with the one obtained
from surrogate data generated according to the single-index
model (continuous line).

FIG. 11. Comparison of the probability density function
of the variety $\sigma(t)$ obtained from real data (diamonds) with the
one obtained from surrogate data generated according to the
single-index model (continuous line).

In summary, the single-index model gives a good approximation
of the statistical behavior of $\mu(t)$, $\mu_i$ and
$\sigma_i$, whereas it describes poorly the statistical behavior of
the variety of a portfolio of stocks traded in a financial
market. This conclusion is also supported by the observation
that the autocorrelation function of the surrogate
variety decays in 2-3 trading days to the value 0.1 and
its power spectrum is very similar to a white noise spectrum,
whereas long-range correlation is observed in real
data.
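
The autocorrelation comparison mentioned above can be reproduced with a plain sample estimator. The sketch below is illustrative only: the function name `autocorrelation` is hypothetical, and the estimator (normalized by the lag-0 term) is one common choice, not necessarily the one used for the results reported here.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0, 1, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)            # lag-0 term, so the result starts at 1
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])
```

Evaluating it on the real and on the surrogate variety series, and inspecting how quickly each curve falls toward zero, gives a direct check of the short memory of the surrogate data against the long-range correlation of the real data.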

A more refined analysis shows that the artificial ensemble
return distribution is systematically less leptokurtic
than the real one. Moreover, in Ref. [Gl| we show that
the single-index model is unable to predict the change in
the symmetry properties of the ensemble return distribution
in crash and rally days. The differences observed
between the behavior of real data and the behavior of
surrogate data suggest that the correlations among the
stocks can be explained by the single-index model only
for "normal" periods, in first approximation, whereas the
model completely fails to reproduce the correlation behavior
during extreme events.
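
A simple way to quantify how leptokurtic the ensemble return distribution is on each day is the excess kurtosis of the cross-section of returns. The sketch below is one possible measure, not necessarily the one used in the refined analysis referred to above; the function name `ensemble_excess_kurtosis` is hypothetical.

```python
import numpy as np

def ensemble_excess_kurtosis(returns):
    """Excess kurtosis of the cross-section of returns on each trading day.

    returns: array of shape (T, n). The result has shape (T,) and is close
    to 0 for a Gaussian cross-section and positive for a leptokurtic one.
    """
    centered = returns - returns.mean(axis=1, keepdims=True)
    m2 = (centered ** 2).mean(axis=1)   # second central moment per day
    m4 = (centered ** 4).mean(axis=1)   # fourth central moment per day
    return m4 / m2 ** 2 - 3.0
```

Comparing the daily values obtained from real and from surrogate returns would show whether the surrogate distribution has systematically thinner tails.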

VI. CONCLUSIONS 



The present study shows that one needs to consider
not only the statistical properties characterizing the time
evolution of the price of each stock traded but also the
synchronous collective behavior of the portfolio considered
to reveal the overall complexity of a financial market. We
show that such collective behavior of a portfolio of stocks
is efficiently monitored by the variety of the ensemble return
distribution. This variable is directly observable for
each portfolio and presents interesting statistical properties.
It is non-Gaussian distributed and long-range correlated.
The detailed statistical properties depend on
the considered portfolio of stocks. We verify that for a
portfolio of stocks characterized by comparable capitalization
the distribution of the variety is approximately
lognormal. Deviations from the lognormal behavior are
observed for less homogeneous (in capitalization) portfolios.

The shape of the distribution and the long-term memory
of the variety are not reproduced by surrogate data
simulated by using a single-index model
with a realistic time series for the "market". This implies
that the complexity detected by the empirical analysis
performed here cannot be modeled with such a simple
stock price model. The correlations present in the market
are more complex than the ones hypothesized by the
single-index model.

The correct modeling of the statistical properties of the
variety can then be used as a benchmark for stock price
models more sophisticated than the single-index model.

The ensemble return distribution shows a qualitatively
and quantitatively different behavior in "normal" and extreme
trading days. The variety of a portfolio is therefore able
to detect quite clearly shocks and aftershocks occurring
in the market. Hence, it is a promising direct observable
able to measure how much a portfolio is under pressure
and how distant it is from the typical market activity in a
specific trading day. A theoretical challenge is to relate
this empirical ensemble observation directly with the
correlations active between pairs of stocks of a portfolio.

In summary, we believe that the overall complexity of
a financial market can be detected and modeled only by
considering simultaneously (i) the statistical properties
of the time evolution of stock prices of the considered
portfolio and (ii) the statics and dynamics of the
correlations existing between stocks.